1 Introduction

This paper documents a model of the COVID-19 epidemic in South Africa. Mobility data is used to model the reproduction number of the COVID-19 epidemic over time using a Bayesian hierarchical model. Results are calibrated to excess and reported deaths rather than case counts. This is achieved by adapting the work of Imperial College London researchers [1] for South Africa. Authors from Imperial College London have built similar models for Brazil [2] and the United States [3].

The model is based on data as compiled in [4]. The model uses mobility movement indexes by province produced by Google [5]. Furthermore the model includes a fixed effect for interventions introduced at the start of level 4 lockdown.

The model and report are automatically generated on a regular basis using R [6]. This version contains data available on 11 August 2020.

An online, regularly updated version of this report is available here.

2 Updates

As this paper is updated over time this section will summarise significant changes. The code producing this paper is tracked using git. The git commit hash for this project at the time of generating this paper was f95bd860ce06c9fa7156ef3c52ce6f8f71a919eb.

2020-05-31

  • Document is made available online (pending review).

2020-06-01

2020-06-04

  • Incorporated feedback from reviewer 1.

2020-06-08

2020-06-09

  • Incorporate feedback from reviewer 4.
  • Remove the under review designation. As at this date the model has feedback from 4 reviewers and also comments from others. The author will continue to update this model and seek review as needed.
  • Due to mobility increasing from Level 3, an additional projection scenario has been introduced involving decreased mobility (back to level 4 levels).
  • Numerous updates on plots including some residual plots.

2020-06-10

  • Remove parks index from average mobility index in line with [7]. Will drop residential index soon too.

2020-06-20

  • Update IFRs for HIV. Possibly need further increases based on research coming out in near future.
  • Remove residential mobility, such that there is only one mobility parameter left.

2020-06-22

  • Add parameter for fixed effects correlated with the start of level 4 lockdown aimed at capturing the effect of mask wearing and workplace screening.

2020-07-04

  • Updated standard deviation of prior for \(\beta\) to reflect [8]. This allows for a greater impact of mask wearing on \(R_{t,m}\).
  • \(\beta_{m}^s\) prior updated to allow increases and decreases in effect of interventions implemented at start of level 4 lockdown.

2020-07-07

  • Major changes to allow for excess deaths exceeding reported COVID-19 deaths.
  • Expanded calibration graphs for more provinces.

2020-07-16

  • Update excess reporting based on [9] data.
  • Increase uncertainty around completeness of reporting factors.

2020-07-26

  • Remove 14-day backtesting to reduce total run time.
  • Further update to excess deaths calculation to give reporting data an additional week of run-off.

3 Methodology

A detailed description of the methodology and assumptions is provided below. The key features of the approach employed are summarised here:

  • It uses a Bayesian Hierarchical Model to calibrate model parameters based on observed death data and prior assumptions.
  • Changes in the reproductive number are linked to mobility data [5] as well as introduction of other non-pharmaceutical interventions (NPIs).
  • The reproductive number over time is used to model the number of infections occurring over time.
  • The model uses population-weighted infection fatality ratios (IFRs) with some noise added, as well as assumptions about the time it takes to die, to model deaths from infections.
  • There is a single combined model for all provinces. This model shares information between provinces, but province specific effects are also allowed.
  • The model does not allow for most interventions that are not observable via mobility changes, such as testing and contact tracing. The model includes the effect of the average level of these interventions. Social distancing, for example, may have reduced the reproduction number throughout the past period. However, the model will not take account of changes in these interventions. These interventions may also dampen increases in the reproductive number as mobility increases.
  • The model is fitted by combining the prior assumptions about the various parameters with the reported death data, producing updated parameter and outcome distributions that reflect both.

4 Mobility

4.1 National Mobility Data

The national mobility data from [5] is plotted below. The model uses the indexes at a provincial level but here the national indexes are plotted for convenience. Clear trends are observable:

  • Mobility generally reduces before the lockdown on 27 March.
  • There is an increase in mobility in the days just before the lockdown, in particular in the Grocery & Pharmacy and Retail indexes. This is perhaps an indication of pre-lockdown “panic buying”.
  • Mobility is relatively stable during level 5 lockdown at low levels.
  • Mobility increases when level 4 starts.

The chart below summarises the average mobility index used in the model. This is the average of the mobility indicators excluding parks and residential. This follows [7].
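The averaging step can be illustrated with a minimal sketch. The category keys, the function name and the sign convention (an index of 1 meaning a 100% reduction relative to baseline, as used in the parameter plots later in this paper) are assumptions made for illustration and are not taken from the code behind this report.

```python
# Hypothetical sketch: average mobility index excluding parks and residential.
# Inputs are percentage changes from baseline (e.g. -60 means 60% below baseline).

EXCLUDED = {"parks", "residential"}


def average_mobility_index(pct_change_by_category):
    """Return the mean reduction in mobility as a fraction (1.0 = 100% reduction)."""
    used = [v for k, v in pct_change_by_category.items() if k not in EXCLUDED]
    if not used:
        raise ValueError("no mobility categories left after exclusions")
    # Assumed sign convention: a drop in mobility gives a positive index.
    return -sum(used) / (100.0 * len(used))


# Made-up values for a single province and day, not data from [5].
day = {
    "retail_and_recreation": -70,
    "grocery_and_pharmacy": -40,
    "parks": -30,
    "transit_stations": -80,
    "workplaces": -50,
    "residential": 25,
}
index = average_mobility_index(day)
```

With these made-up values the four retained categories average to a 60% reduction, so the index is 0.6.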

There are some risks in using the mobility data above, as it may be biased towards users of the Android operating system, people with smartphones and people with enough data on their smartphones to share their location.

This bias is only a problem if it changes over time, because the model would then not be able to produce accurate projections. There is a risk that mobility changes disproportionately between users contributing data to these indexes and those not. It would not seem reasonable to expect a major change in the bias over time (unless the calculation is somehow compromised). Provincial biases may exist, but the model also allows the provinces to exhibit province-specific mobility effects for both parameters.

4.2 Provincial Mobility Data

The average mobility (excluding residential & park indexes) is plotted below for the various provinces. From this plot it seems apparent that Western Cape mobility has reduced the most during lockdown and is, perhaps, increasing more slowly.

5 Calibration

The sections below show how well the model is reproducing past death data. As the model is fitted to past death data we would want to see how well it’s reproducing such data. Each section covers a province.

Three panels are plotted for each province:

  1. The first panel shows the modelled daily number of infections (blue) compared to confirmed case counts (brown) as reported for the province. Note that this model does not calibrate to case data, but this data is shown for reference. We expect the model to produce infections exceeding confirmed cases as testing is likely only finding a fraction of the true infections. For example, New York City was estimated to have as many as 2.7m total infections as measured by antibody testing by April 2020 while only having 263 000 confirmed cases at that time [10].
  2. The second shows the daily count of deaths reported in the province (brown) and the modelled deaths in blue. A good model would produce death forecasts such that the actual reported past deaths would appear in the confidence intervals of the prediction.
  3. The third panel shows the estimates for reproductive number (\(R_{t,m}\)) over time (\(t\)) for each province (\(m\)). These values are generating the infections in the model which result in deaths. Note that due to the delay of deaths following infections the last week or two of \(R_{t,m}\) values already present an extrapolation from the data. Deaths as a result of these recent infections have generally not occurred as yet.

In all the charts the darker shaded area represents a confidence interval of 50% and the lighter shaded area represents a confidence interval of 95%.
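As a rough illustration of how such shaded bands can be derived from posterior draws, the sketch below computes central 50% and 95% intervals by linear interpolation between sorted draws. The function name and the stand-in draws are hypothetical; the report's own plotting code may compute its bands differently.

```python
# Hypothetical sketch: central intervals from a list of posterior draws.

def central_interval(draws, level):
    """Return (lower, upper) bounds of the central interval covering `level` of the draws."""
    xs = sorted(draws)

    def quantile(q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(xs) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    tail = (1.0 - level) / 2.0
    return quantile(tail), quantile(1.0 - tail)


draws = list(range(101))  # stand-in for posterior draws of a daily quantity
band50 = central_interval(draws, 0.50)  # darker band
band95 = central_interval(draws, 0.95)  # lighter band
```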

In general, it is noted:

  • The model appears to provide a reasonable fit to daily number of deaths reported (where there is enough data).
  • \(R_{t,m}\) dropped significantly from the declaration of state of disaster, closure of schools and the implementation of lockdown.
  • \(R_{t,m}\) picked up in the days ahead of the start of Level 5 lockdown, presumably as people prepared for lockdown.
  • \(R_{t,m}\) dropped significantly during Level 5 lockdown. For most provinces it did not drop below 1. \(R_{t,m}\) below 1 would indicate an epidemic where each infection is generating less than 1 further infection. This would be an epidemic that is slowing down and, should \(R_{t,m}\) be sustained below 1, such an epidemic would stop.
  • \(R_{t,m}\) is modelled to move higher after the start of Level 4 lockdown and further with the start of Level 3.

5.1 Western Cape

The Western Cape has the most reported deaths of all provinces and hence the most data to calibrate. The modelled infections are plotted below. It’s clear that modelled infections are far outpacing reported cases (brown).

Over the last 14 days it would appear that the Western Cape only tested 4.7% of all new infections. This figure seems quite low, but it seems that the growth in infection in the last number of weeks coincided with a large backlog of tests as well as a change in Western Cape testing policy. The Western Cape is limiting tests due to shortages to only higher risk groups [11].
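The proportion quoted above can be read as a ratio of confirmed cases to modelled infections over the same window. The sketch below assumes exactly that definition; the function name and the input series are made up for illustration and are not the report's data.

```python
# Hypothetical sketch: share of modelled infections that appear as confirmed cases.

def share_of_infections_detected(confirmed_cases, modelled_infections, window=14):
    """Ratio of confirmed cases to modelled infections over the last `window` days."""
    if len(confirmed_cases) < window or len(modelled_infections) < window:
        raise ValueError("need at least `window` days of data")
    cases = sum(confirmed_cases[-window:])
    infections = sum(modelled_infections[-window:])
    return cases / infections


# Made-up daily series, not data from the report.
cases = [50] * 20
infections = [1000] * 20
share = share_of_infections_detected(cases, infections)
```

Note that confirmed cases lag infections, so a ratio over a common window is only an approximation when infections are changing quickly.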

The reported deaths (in brown) and modelled reported deaths (in blue) are plotted below. The model appears to be reasonable given the data. The data does seem quite variable from day to day, which may be due to data processing delays causing clumping of reported deaths.

Below we plot the residuals by day. The projections for the Western Cape model are exceeding reported deaths in June. This has improved since the introduction of the mask wearing variable, but still is not great.

The \(R_{t,m}\) has actually reduced at the start of level 4 lockdown. Western Cape \(R_{t,m}\) has also been reducing over time to present levels.

5.2 Eastern Cape

Below modelled infections are plotted compared to confirmed cases. Over the last 14 days it would appear that 1.1% of all new infections were tested.

The reported deaths (in brown) and modelled reported deaths (in blue) are plotted below. The model for the Eastern Cape also appears slightly on the high side, but less so than for the Western Cape.

Below we plot the residuals by day.

Below the \(R_{t,m}\) is plotted:

5.3 Gauteng

Recent spikes of deaths seem to be resulting in an increasing \(R_{t,m}\). Over the last 14 days it would appear that 1.1% of all new infections were tested.

The reported deaths (in brown) and modelled reported deaths (in blue) are plotted below.

Below we plot the residuals by day.

Below the \(R_{t,m}\) is plotted:

5.4 KwaZulu-Natal

Recent spikes of deaths seem to be resulting in an increasing \(R_{t,m}\). Over the last 14 days it would appear that 1.7% of all new infections were tested.

The reported deaths (in brown) and modelled reported deaths (in blue) are plotted below.

Below we plot the residuals by day.

Below the \(R_{t,m}\) is plotted:

5.5 Other Provinces

The other provinces have limited data.

Over the last 14 days it would appear that 0.8% of all new infections were tested.

The reported deaths (in brown) and modelled reported deaths (in blue) are plotted below.

Below we plot the residuals by day.

Below the \(R_{t,m}\) is plotted:

6 Parameter Estimates

To understand the net parameter estimates for \(\alpha\) and \(\alpha_{m}^s\) and their impact on \(R_{t,m}\) we plot the percentage reduction in \(R_{t,m}\) assuming the particular index is 1 (representing a 100% reduction in average mobility).

This is equivalent to plotting \(1-2\cdot\phi^{-1}(-(\alpha+\alpha_{m}^s))\) for a particular province \(m\). We also plot \(1-2\cdot\phi^{-1}(-(\beta+\beta_{m}^s))\).
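A minimal sketch of this calculation follows, treating \(\phi^{-1}\) as the inverse logit as defined in the model section. The function and argument names are hypothetical and the parameter value is made up.

```python
import math


def inv_logit(x):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))


def reduction_at_full_index(effect, province_effect=0.0):
    """Fractional reduction in R when the index equals 1: 1 - 2 * inv_logit(-(effect + province_effect))."""
    return 1.0 - 2.0 * inv_logit(-(effect + province_effect))


# Made-up parameter value, not a fitted estimate.
example = reduction_at_full_index(1.0)
```

A zero effect gives no reduction, and the expression is algebraically equal to \(\tanh((\alpha+\alpha_{m}^s)/2)\), so it always lies strictly between -1 and 1.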

Confidence intervals are wide, though the average mobility index shows a big impact on \(R_{t,m}\) (assuming 100% reduction in mobility). Interventions introduced with level 4 lockdown also seem to be reducing \(R_{t,m}\) by between 10% and 30% on average.

7 Reproduction Number Estimates

7.1 Initial Reproduction Number

Estimates for \(R_{0,m}\) for each province are plotted below. It is clear that the posterior estimates for \(R_{0,m}\) are not heavily influenced by the data. This is probably due to the relatively early lockdown implemented in South Africa. There were probably not enough deaths that resulted from infection prior to lockdown to develop an independent estimate of \(R_{0,m}\) for each province.

7.2 Current Reproduction Number as at 11 August 2020

Current estimates and 95% confidence intervals for \(R_{t,m}\) (current reproduction number) are plotted below for each province. It’s clear that currently the values of \(R_{t,m}\) for some provinces now include 1 in the CIs. A value below 1 would indicate an epidemic that is slowing while a value above 1 indicates an epidemic that is growing. It is clear that the spread of the epidemic is somewhat slowed compared to the initial \(R_{0,m}\).

The wide confidence intervals would indicate that we need to wait for the epidemic to further develop to include more data in our models. The confidence interval for Western Cape is narrowing already.

8 Attack Rate as at 11 August 2020

The estimated attack rate (with 95% confidence intervals) is tabulated below. This is the proportion of the population infected to date. This figure has to be estimated, because many that are infected experience no or mild symptoms, thus they may not seek medical advice and hence will never be tested.

Western Cape has the highest estimated prevalence to date, but with fairly wide confidence intervals. Eastern Cape has the second highest prevalence.

The Western Cape figures may seem high, but a recent presentation indicates positive rates from 20% to 40% during the final weeks of May 2020, depending on sub-district and whether private or public health facilities are doing the tests [12]. Public sector testing does seem closer to a 30% proportion positive at present from the graphs in that presentation. This information is not equivalent to an attack rate, but may be indicative of the high numbers of infections being modelled here, especially those that have occurred relatively recently.

Province | Attack Rate
EC | 41.68% [25.53%-59.67%]
GP | 53.90% [33.05%-74.17%]
KZN | 34.93% [18.85%-56.63%]
WC | 17.68% [12.19%-25.54%]
OTH | 35.49% [14.29%-64.99%]
South Africa | 38.77% [27.94%-51.04%]

9 Projections

One of the reasons we build models is so that we can make sense of the future or indeed the past. We can project forward models to assess the impact of varying assumptions on future outcomes. This gives us a sense of how changes in actions may impact the future, i.e. it allows us to answer “what if” questions. Note however that in projecting the future we are extrapolating, and due care needs to be taken. There are numerous limitations to this model and these projections, discussed below, but the author is also of the opinion that the projections add value in that they indicate a significant range across the scenarios projected and in such a manner inform discussion.

All models are wrong but some are useful - George Box [13]

An incorrect model can be useful because it enables a better understanding of the model and the phenomena being modelled. This section should be used with caution because, as we show in calibration and backtesting, the model seems to be projecting higher deaths than observed for the Western Cape and Eastern Cape.

Note detailed projection output can be found here.

9.1 No change in Mobility

The first projection holds mobility constant at current levels, which would be associated with level 3 of lockdown. Level 4 interventions are left intact.

The result of this scenario as at 31 December 2020 is tabulated below.

Province | Attack Rate | Reported Deaths | Peak Daily Reported Deaths | Peak Reported Date | Deaths | Peak Daily Deaths | Peak Date
EC | 59.01% [46.14%-68.82%] | 6 356 [3 712-10 316] | 80 | 2020-08-23 | 25 218 [18 737-31 985] | 319 | 2020-08-21
GP | 71.54% [62.04%-79.30%] | 11 005 [5 013-19 370] | 189 | 2020-08-23 | 46 236 [35 892-57 245] | 789 | 2020-08-21
KZN | 69.91% [60.64%-77.84%] | 10 906 [5 836-17 275] | 190 | 2020-09-05 | 37 476 [29 531-45 940] | 654 | 2020-09-04
WC | 26.80% [16.99%-38.71%] | 6 875 [4 622-10 093] | 45 | 2020-08-01 | 8 780 [5 725-12 792] | 58 | 2020-07-31
OTH | 77.24% [67.73%-84.87%] | 11 074 [3 132-23 788] | 223 | 2020-09-08 | 72 099 [56 348-88 988] | 1 425 | 2020-09-06
South Africa | 66.40% [60.89%-71.43%] | 46 216 [32 823-63 051] | 684 | 2020-09-03 | 189 809 [166 671-213 803] | 3 067 | 2020-09-02

9.2 Increase in mobility

This scenario assumes future mobility half-way between current mobility levels (associated with level 3 of lockdown) and normal levels. Level 4 interventions are left intact. The attack rate and deaths after increasing mobility are tabulated below (as at 31 December 2020).

Province | Attack Rate | Reported Deaths | Peak Daily Reported Deaths | Peak Reported Date | Deaths | Peak Daily Deaths | Peak Date
EC | 61.75% [45.33%-72.84%] | 6 667 [3 774-11 121] | 81 | 2020-08-25 | 26 396 [18 693-34 093] | 320 | 2020-08-22
GP | 74.65% [62.83%-82.46%] | 11 521 [5 116-20 565] | 191 | 2020-08-25 | 48 261 [36 880-60 184] | 792 | 2020-08-22
KZN | 75.43% [63.26%-84.76%] | 11 775 [6 231-18 826] | 214 | 2020-09-08 | 40 440 [31 211-50 085] | 729 | 2020-09-07
WC | 29.83% [15.62%-50.93%] | 7 628 [4 331-13 620] | 45 | 2020-08-01 | 9 738 [5 321-16 934] | 58 | 2020-07-31
OTH | 79.39% [68.89%-87.36%] | 11 402 [3 185-24 601] | 236 | 2020-09-09 | 74 116 [57 623-91 664] | 1 496 | 2020-09-07
South Africa | 69.62% [62.69%-76.14%] | 48 992 [34 145-67 407] | 729 | 2020-09-05 | 198 951 [172 662-226 431] | 3 238 | 2020-09-04

Based on the above, an increase in mobility could mean roughly 9 000 more deaths by the end of the year.

9.3 Decreased mobility

This scenario assumes future mobility at average levels seen during level 4 lockdown. This may occur either through actual reinstatement of level 4, or perhaps can be considered as a possibility if increased mobility does not result in as much increase in the reproductive number.

The attack rate and deaths after decreased mobility are tabulated below (as at 31 December 2020).

Province | Attack Rate | Reported Deaths | Peak Daily Reported Deaths | Peak Reported Date | Deaths | Peak Daily Deaths | Peak Date
EC | 53.99% [45.00%-64.56%] | 5 777 [3 542-8 858] | 80 | 2020-08-21 | 23 059 [17 918-29 032] | 318 | 2020-08-20
GP | 66.90% [57.91%-76.90%] | 10 212 [4 882-17 312] | 187 | 2020-08-22 | 43 203 [33 739-53 725] | 786 | 2020-08-20
KZN | 58.38% [49.49%-68.06%] | 9 059 [5 065-14 017] | 164 | 2020-08-30 | 31 275 [24 481-38 594] | 574 | 2020-08-29
WC | 25.33% [18.27%-33.44%] | 6 509 [4 929-8 620] | 45 | 2020-08-01 | 8 311 [6 135-11 117] | 58 | 2020-07-31
OTH | 62.98% [51.54%-74.74%] | 8 807 [2 863-17 983] | 170 | 2020-09-02 | 58 726 [44 571-74 335] | 1 163 | 2020-08-31
South Africa | 57.70% [52.69%-62.86%] | 40 364 [29 638-53 355] | 622 | 2020-08-28 | 164 574 [145 108-185 734] | 2 798 | 2020-08-27

Based on the above a decrease in mobility could mean roughly 25 000 fewer deaths by the end of the year.
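The rounded differences quoted for the increased and decreased mobility scenarios can be checked against the national central estimates in the "Deaths" column of the three scenario tables above. The variable names below are illustrative only.

```python
# National central estimates of deaths as at 31 December 2020,
# copied from the "South Africa" rows of the three scenario tables.
deaths_constant_mobility = 189_809
deaths_increased_mobility = 198_951
deaths_decreased_mobility = 164_574

# Additional deaths if mobility increases relative to constant mobility.
extra_deaths = deaths_increased_mobility - deaths_constant_mobility

# Deaths avoided if mobility decreases relative to constant mobility.
fewer_deaths = deaths_constant_mobility - deaths_decreased_mobility
```

These differences are 9 142 and 25 235 respectively, consistent with the rounded figures of 9 000 and 25 000.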

9.4 Western Cape

Below we plot projections for the Western Cape.

The attack rate and deaths over a longer period for both the constant mobility and increased mobility scenarios are plotted below:

9.5 Eastern Cape

Below we plot the projections for Eastern Cape.

The attack rate and deaths over a longer period for both the constant mobility and increased mobility scenarios are plotted below:

9.6 Gauteng

Below we plot the projections for Gauteng.

The attack rate and deaths over a longer period for both the constant mobility and increased mobility scenarios are plotted below:

9.7 KwaZulu-Natal

Below we plot the projections for KwaZulu-Natal.

The attack rate and deaths over a longer period for both the constant mobility and increased mobility scenarios are plotted below:

9.8 Other provinces

Below we plot the projections for the provinces other than Western Cape, Eastern Cape, Gauteng and KwaZulu-Natal.

The attack rate and deaths over a longer period for both the constant mobility and increased mobility scenarios are plotted below:

9.9 South Africa

We plot below the results for South Africa as a whole. This is the sum of the provincial projections.

Over the next 30 days the attack rate and deaths are increasing rapidly. The model predicts that within 30 days it is possible that deaths will exceed 500 per day and that more than 10% of the South African population may be infected.

The attack rate and deaths over a longer period for both the constant mobility and increased mobility scenarios are plotted below:

10 Backtesting

In the sections below the model is backtested. The backtesting assumes perfect knowledge of the mobility indexes to date, but tests the models with the 28 most recent days of death data excluded. The test is to see how well the model predicts forward over time.
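The holdout described above can be sketched as follows. This only illustrates the data split and the residual calculation, not the model fit itself; the function names and the sign convention for residuals (observed minus predicted) are assumptions.

```python
# Hypothetical sketch: hold out the most recent days of death data for backtesting.

def split_for_backtest(daily_deaths, holdout_days=28):
    """Return (fitting_data, held_out_data), excluding the last `holdout_days` from fitting."""
    if holdout_days <= 0 or holdout_days >= len(daily_deaths):
        raise ValueError("holdout_days must be between 1 and len(daily_deaths) - 1")
    return daily_deaths[:-holdout_days], daily_deaths[-holdout_days:]


def residuals(observed, predicted):
    """Observed minus predicted, day by day (assumed sign convention)."""
    if len(observed) != len(predicted):
        raise ValueError("series must have the same length")
    return [o - p for o, p in zip(observed, predicted)]


# Made-up series of 60 days of reported deaths.
series = list(range(60))
fit_part, held_out = split_for_backtest(series)
```

Under this convention a model that over-predicts reported deaths shows negative residuals on the held-out days.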

Below we plot the results backtesting with the most recent data excluded as indicated by the black dotted line (28 days).

Below we plot the residuals for the above.

11 Discussion

11.1 Limitations

This analysis has various limitations:

  • The models are based on somewhat limited data for South Africa. In particular, the effects of lockdown in some of the provinces may be simple “extrapolations” using the fixed effect (\(\alpha\)) and shared prior of \(R_{0,m}\) from other provinces. In particular, the experience in the Western Cape may be influencing the results of other provinces. As these fits are updated with new data, though, the models should become more accurate for each province, resulting in better estimates.
  • The models are simplistic high-level single population models for each province. They contain no differentiation by age groups or any further details. This is required to make the hierarchical models manageable. They still seem to provide useful information on potential movements in the epidemic. However, this homogeneity may result in overestimation of attack rates and hence deaths.
  • Similar to the above point various groups even of the same age may have varying susceptibility to the virus. Varying susceptibility may reduce the effective herd immunity threshold of the disease.
  • Again extending the point above varying levels of susceptibility may also be correlated with varying
  • The models do not take account of changes in many interventions that do not impact general mobility such as social distancing, handwashing and testing, tracing and isolation. The models do take account of the average impact of these interventions over time, but changes in these interventions will not be taken account of.
  • The paper does take account of mask-wearing, workplace screening and other NPIs introduced with the start of level 4 lockdown. This is done approximately and only measures the implementation of the regulations and not actual compliance with them. The parameter may also not be capturing only this effect and may include other effects that occurred at the same time. Similarly, these variables may not be fully capturing increased compliance with these regulations over time.
  • Google Mobility data may not effectively cover the South African population. It’s unclear how uniform the coverage is in, say, lower income populations who may have limited data and hence not contribute significantly to the mobility indexes. Smartphone penetration in South Africa is estimated to exceed 90% [14], though. The representativeness of the data may therefore be driven mainly by access to data as well as by how many people have enabled the feature used to generate the data (Location History). Changes in mobility affecting different segments of the population differently may result in unexpected results.
  • IFR assumptions may be wrong:
    • It’s based on Chinese data and may therefore be higher or lower in SA. The IFR employed though has been found consistent with various sources [15].
    • It does not consider health system capacity. In adverse scenarios the IFR is likely to increase if health system capacity is exceeded.
  • The model assumes that all COVID-19 deaths are captured and reported via NICD into [4].
  • The model allows for estimates of underreporting using [16]. This may be inaccurate or change over time.
  • The model assumes that deaths occur on the day they were reported. Based on discussions with experts it appears that deaths are reported late.
  • Backtesting has not been done over periods longer than 28 days.

11.2 Impact of Interventions

From the results it’s clear that \(R_{t,m}\) in all provinces has reduced from the starting values of \(R_{0,m}\) and this has slowed the spread of the epidemic in South Africa, saving lives. However, in Europe the lockdowns have been able to reduce \(R_{t}\) below 1 for various countries [17]. As shown here, South Africa’s lockdown and other interventions have not been as successful as in most of the European countries shown in [17].

11.3 Lockdown - Level 4

Mobility has increased somewhat following the commencement of level 4 lockdown. This results in an increasing \(R_{t,m}\) in various provinces since the start of May, with a corresponding increase in deaths two to four weeks later.

11.4 Lockdown - Level 3

Mobility has increased further during level 3 lockdown. This results in an increasing \(R_{t,m}\) in various provinces since the start of May. The death data corresponding to this increased \(R_{t,m}\) is not significant as yet and we will soon see whether this model is handling this accurately, or whether other interventions mean that Level 3 is not increasing \(R_{t,m}\). Other mitigations may dampen the effect of increasing mobility.

11.5 Further relaxation of lockdown

Projections on current mobility levels would already result in significant peaks in deaths. Further relaxation of lockdowns will result in increases in mobility, which would worsen the reproduction numbers.

Based on the modelling, this is expected to result in roughly 9 000 more deaths by the end of the year. This ignores other impacts such as ICU availability and the impact on deaths from other causes, which would be expected to increase these figures, but also the impact of the homogeneity assumption, which might be exaggerating this number.

11.6 Reducing Mobility

Reducing mobility could result in roughly 25 000 fewer deaths by the end of the year. This could happen via reverting lockdown to level 4 or, possibly, if other non-mobility interventions act as a dampener on any increases in the reproduction number. Again, this ignores other impacts such as ICU availability and the impact on deaths from other causes.

11.7 Differences between provinces

The large differences between the various provinces were surprising. Given the high prevalence in the Western Cape it may be prudent to enforce some screening or travel restrictions out of the province.

11.8 Relevance of IFR

The IFR is not treated as a parameter but as a constant with random noise. Changes to the IFR will change the modelled infections that correlate with the observed deaths. Sensitivity to the IFR could be modelled.

11.9 Using mobility data

Using mobility data is useful to not only measure the impact of government interventions but also include the societal response to those interventions (as they affect mobility). This means that changes in the reaction to new regulations can be modelled. It may also be useful going forward as many new regulations are introduced possibly at a provincial level to summarise the impact of interventions numerically.

11.10 Untangling mobility and other interventions

This paper projects mobility changes forward. We can see that the model assigns some value to the initiatives introduced at the start of level 4, such as mask wearing and workplace screening. These are reducing the impact of increasing mobility such that in the Western Cape the \(R_{t,m}\) actually reduced in level 4 compared to level 5.

It is difficult to know how this will continue going forward, especially given that the projections for the Western Cape are still exceeding the observations both in calibration and in backtesting.

11.11 Health system capacity

When health system capacity is under pressure in various provinces it would seem likely that the mortality rate of infected individuals in those provinces would increase. This model needs to take account of that for two reasons:

  1. Given the relationship between deaths and infections in this model, a sudden increase in deaths due to capacity constraints would incorrectly be treated as a rise in infections.
  2. To project forward accurately the expected deaths this model needs to allow for future capacity constraints as well.

The model does not currently do this.

11.12 Backtesting & poor fit to Western Cape (and Eastern Cape)

Backtesting shows reasonable performance of the model, at least over a 28-day time period. There is some bias, in that the model appears to be over-predicting reported deaths.

Some possible reasons are:

  • Under- or late-reporting increasing in the most recent week or two.
  • Increased mobility being muted by other interventions increasing.
  • Some form of heterogeneous susceptibility kicking in.
  • Over-estimating of the epidemic to date due to under-estimated IFRs.
  • Not enough data yet to adjust the curve for the latest emerging experience.

These will be investigated to the extent possible.

11.13 Improving the Model

The model is still projecting deaths that are too high for the Western Cape and Eastern Cape in calibration and/or backtesting.

The author intends to investigate the introduction of an autoregressive process as per [3] to model further residuals that cannot seem to be captured by existing parameters. These may be reflecting other NPIs, general population awareness and behaviour adjustment, or other unmodelled disease effects.

Other areas to investigate in this regard:

  • The potential for increasing reporting delays in a rising epidemic and the impact here.
  • The potential for changes in underreporting and its impact here.
  • Heterogeneous susceptibility, or lives that may not be susceptible, may impact results here.

11.14 Further Work

Along with the above, the author intends to investigate the following:

  • Search and include further indexes that may track other interventions to model their effect.
  • Analyse the sensitivity of results to changes in IFR.
  • Consider spreading deaths backwards based on estimated reporting delays.
  • Analyse the impact of different assumptions of distribution of deaths on the sensitivity of projections.
  • Consider incorporating health system constraints and their impact on IFRs.
  • Consider adjusting completeness of reporting over time.

12 Detailed Assumptions and Methodology

The model assumes that current reproduction number, \(R_{t,m}\), is a function of the initial reproduction number, \(R_{0,m}\), and mobility changes over time. It then calculates infections as a function of \(R_{t,m}\) over time, and then, using these infections calculates deaths from the infections based on a distribution of time to death. Various prior distributions are assumed. The model structure is identical to that in [1] but is briefly documented below. The parameters are estimated jointly using a single hierarchical model covering all provinces. This means that data in different provinces are combined to inform all parameters in the model. As per [1], fitting was done in R using Stan with an adaptive Hamiltonian Monte Carlo sampler.

12.1 The Model

The model is documented here but is as per [1], apart from the simplification of mobility indexes to a single index. The model assumes a base reproduction number (\(R_{0,m}\)) for each province (\(m\)) and then models future values of the reproduction number using mobility indexes as follows:

\(R_{t,m}=R_{0,m}\cdot2\cdot\phi^{-1}(-(\alpha+\alpha_{m}^s)I_{t,m}^{\alpha}-(\beta+\beta_{m}^s)I_{t,m}^{\beta})\)

Here:

  • \(t\) is the day.
  • \(\phi^{-1}\) is the inverse logit function.
  • \(I_{t,m}^{\alpha}\) is the average mobility index from Google Mobility [5] data for each province over time. Typical mobility is represented by 0 and atypical mobility by increases or decreases relative to this.
  • \(\alpha\) is the effect of \(I_{t,m}^{\alpha}\) on \(R_{t,m}\) independent of province.
  • \(\alpha_{m}^s\) is the province specific effect of \(I_{t,m}^{\alpha}\) on \(R_{t,m}\).
  • \(I_{t,m}^{\beta}\) is an indicator for the other interventions introduced with level 4 lockdown. \(I_{t,m}^{\beta}\) is 0 before the start of level 4 lockdown and 1 from the start of level 4 lockdown onwards.
  • \(\beta\) is the effect of \(I_{t,m}^{\beta}\) on \(R_{t,m}\) independent of province.
  • \(\beta_{m}^s\) is the province specific effect of \(I_{t,m}^{\beta}\) on \(R_{t,m}\).
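As a minimal sketch of the formula above (written in Python for illustration, not the R and Stan code used for this paper), the multiplier \(2\cdot\phi^{-1}(\cdot)\) equals 1 when both indexes are 0, so \(R_{t,m}=R_{0,m}\) at typical mobility before level 4. The function names and all parameter values below are hypothetical, and the mobility index is assumed to be expressed as a fraction (for example -0.5 for a 50% reduction).

```python
import math

def inv_logit(x):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))

def reproduction_number(r0, mobility, level4, alpha, alpha_m, beta, beta_m):
    """R_{t,m} for one province and one day.

    mobility: average mobility index I^alpha (0 = typical mobility)
    level4:   indicator I^beta (0 before level 4 lockdown, 1 afterwards)
    """
    linear = -(alpha + alpha_m) * mobility - (beta + beta_m) * level4
    return r0 * 2.0 * inv_logit(linear)

# Hypothetical parameter values, for illustration only.
params = dict(alpha=-1.0, alpha_m=0.0, beta=0.5, beta_m=0.0)
baseline = reproduction_number(3.28, 0.0, 0, **params)
lockdown = reproduction_number(3.28, -0.5, 1, **params)
```

With these illustrative values, reduced mobility and the level 4 indicator both push the reproduction number below its baseline value.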

Infections are modelled as:

\(c_{t,m}=S_{t,m}R_{t,m}\sum_{\tau=0}^{t-1}c_{\tau,m}g_{t-\tau}\) where \(S_{t,m}=1 - \frac{\sum_{i=0}^{t-1}c_{i,m}}{N_{m}}\).

Infections at time \(t\), \(c_{t,m}\), are a function of the proportion of the population not yet infected (\(S_{t,m}\)), the reproduction number (\(R_{t,m}\)), infections prior to that time (\(c_{\tau,m}\)) and an infectiousness curve (\(g_{t-\tau}\)). \(N_m\) is the population in province \(m\).
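The renewal equation above can be sketched for a single province as follows. This is an illustrative Python sketch under stated assumptions, not the paper's Stan code: the seeding of initial infections, the discretised infectiousness curve `g` (indexed by lag in days) and the function name are all hypothetical.

```python
def simulate_infections(r_t, g, population, seed):
    """Renewal-equation infections c_t for one province.

    r_t:        reproduction number for each day
    g:          infectiousness by lag in days; g[0] is unused because tau <= t-1
    population: N_m
    seed:       assumed infections on the first len(seed) days
    """
    c = list(seed) + [0.0] * (len(r_t) - len(seed))
    for t in range(len(seed), len(r_t)):
        # S_{t,m}: share of the population not yet infected
        susceptible = 1.0 - sum(c[:t]) / population
        # Sum over past infections weighted by infectiousness at each lag
        pressure = sum(c[tau] * g[t - tau]
                       for tau in range(t) if t - tau < len(g))
        c[t] = susceptible * r_t[t] * pressure
    return c
```

For example, with all infectiousness concentrated at a lag of one day, a very large population and a constant reproduction number of 2, daily infections approximately double each day.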

Deaths, \(d_{t,m}\) are modelled as:

\(d_{t,m}=ifr_{m}^*\sum_{\tau=0}^{t-1}c_{\tau,m}\pi_{t-\tau}\)

Here:

  • \(ifr_{m}^*\) is the average infection fatality rate (IFR) in the province (with random noise added).
  • \(\pi_{t-\tau}\) is a distribution of time to death since infection.
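The deaths equation is a discrete convolution of past infections with the time-to-death distribution, scaled by the IFR. A minimal Python sketch, assuming a discretised \(\pi\) indexed by lag in days (the function name and inputs are hypothetical):

```python
def expected_deaths(infections, pi, ifr):
    """Expected deaths d_t = ifr * sum_{tau < t} c_tau * pi_{t - tau}.

    infections: daily infections c_tau
    pi:         probability of death at each lag in days (pi[0] is unused)
    ifr:        infection fatality rate for the province
    """
    deaths = []
    for t in range(len(infections)):
        conv = sum(infections[tau] * pi[t - tau]
                   for tau in range(t) if t - tau < len(pi))
        deaths.append(ifr * conv)
    return deaths
```

For instance, 100 infections on day 0, an IFR of 1% and deaths split evenly between lags 1 and 2 give 0.5 expected deaths on each of days 1 and 2.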

We model completeness of reporting as random noise from a Beta distribution:

\(\psi_m \sim B(\mu_mv_m,(1-\mu_m)v_m)\)

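Here:

  • \(\mu_m\) is the mean of \(\psi_m\), since a \(B(a,b)\) distribution has mean \(a/(a+b)\).
  • \(v_m\) controls the dispersion of \(\psi_m\) around \(\mu_m\): larger values give less noise.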

Reported deaths, \(d'_{t,m}\) are then:

\(d'_{t,m}=d_{t,m}\psi_m\)
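The completeness adjustment can be illustrated with a short Python sketch. The values of \(\mu_m\) and \(v_m\) used here are hypothetical and are not the values used in the paper; the sketch only shows that \(B(\mu_m v_m,(1-\mu_m)v_m)\) has mean \(\mu_m\) and how a draw of \(\psi_m\) scales expected deaths.

```python
import random

def beta_shapes(mu, v):
    """Shape parameters of B(mu * v, (1 - mu) * v); the mean of this Beta is mu."""
    return mu * v, (1.0 - mu) * v

def sample_completeness(mu, v, rng):
    """Draw one completeness factor psi_m."""
    a, b = beta_shapes(mu, v)
    return rng.betavariate(a, b)

rng = random.Random(1)
psi = sample_completeness(0.3, 50.0, rng)  # hypothetical mu_m and v_m
reported = 10.0 * psi                      # d'_{t,m} = d_{t,m} * psi_m, with d_{t,m} = 10
```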

Reported daily deaths \(D'_{t,m}\) are assumed to follow a negative binomial distribution, parameterised by its mean and variance:

\(D'_{t,m} \sim Negative\ Binomial(d'_{t,m},d'_{t,m}+ \frac{{d'}_{t,m}^2}{\phi})\)

If \(Y \sim N(\mu,\sigma)\), then \(N^{+}(\mu,\sigma)\) denotes the distribution of \(|Y|\), that is \(|Y| \sim N^{+}(\mu,\sigma)\).

We assume that:

\(\phi \sim N^{+}(0,5)\)

Then: \(d'_{t,m}=E(D'_{t,m})\)
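Many libraries parameterise the negative binomial by a size and a success probability rather than by mean and variance. As a sketch of the conversion (the function names are hypothetical): with mean \(\mu\) and variance \(\mu+\mu^2/\phi\), the size is \(\phi\) and the success probability is \(\phi/(\phi+\mu)\).

```python
def nb_size_prob(mean, phi):
    """Convert (mean, overdispersion phi) to (size, success probability)."""
    size = phi
    prob = phi / (phi + mean)
    return size, prob

def nb_moments(size, prob):
    """Mean and variance of the number of failures before `size` successes."""
    mean = size * (1.0 - prob) / prob
    variance = mean / prob
    return mean, variance
```

Converting and then recomputing the moments recovers the mean and the variance \(\mu+\mu^2/\phi\), which also shows that smaller values of \(\phi\) imply more overdispersion.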

12.2 Prior Assumptions and Random Noise

The following assumptions are taken as is from [1] except for where indicated.

We add random noise to the IFR as follows:

\(ifr_{m}^*=ifr_{m}\cdot N(1,0.1)\)

The priors for \(\alpha\) and \(\beta\) are based on a normal distribution with mean 0:

\(\alpha \sim N(0,0.5)\)

\(\beta \sim N^{+}(0,1)\)

The above distribution for \(\beta\) has been increased to reflect [8], which indicated an effect of mask wearing. That paper reports a relative risk of 0.56 for mask wearing in non-health-care settings. The above prior would indicate a 0.62 expected relative risk. \(\beta\) is not a parameter in the model used in [1].
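The 0.62 figure can be checked with a short calculation. When the level 4 indicator is 1, \(\beta\) multiplies the reproduction number by \(2\cdot\phi^{-1}(-\beta)\). Evaluating this at the prior mean of \(\beta\), which is \(\sqrt{2/\pi}\) for \(N^{+}(0,1)\), gives approximately 0.62. That the quoted figure was derived in this way is an assumption; the sketch below also integrates the multiplier over the whole prior, which gives a somewhat higher value because the multiplier is convex in \(\beta\) for \(\beta>0\).

```python
import math

def relative_risk(beta):
    """Multiplier on R when the level 4 indicator is 1: 2 * inverse logit(-beta)."""
    return 2.0 / (1.0 + math.exp(beta))

# The mean of N+(0, 1), the absolute value of a standard normal, is sqrt(2 / pi).
prior_mean_beta = math.sqrt(2.0 / math.pi)
rr_at_prior_mean = relative_risk(prior_mean_beta)

def expected_relative_risk(steps=20000, upper=10.0):
    """Midpoint-rule integral of relative_risk(b) against the N+(0, 1) density."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        b = (i + 0.5) * h
        density = 2.0 * math.exp(-0.5 * b * b) / math.sqrt(2.0 * math.pi)
        total += relative_risk(b) * density * h
    return total
```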

Then for the provincial specific index effects we use:

\(\alpha_{m}^s \sim N(0,\gamma^{\alpha})\) with \(\gamma^{\alpha} \sim N^{+}(0,0.5)\)

\(\beta_{m}^s \sim N(0,\gamma^{\beta})\) with \(\gamma^{\beta} \sim N^{+}(0,0.5)\)

\(\beta_{m}^s\) is a parameter in the model used in [1].

The \(R_{0,m}\) are defined to be distributed normally as follows:

\(R_{0,m} \sim N(3.28,\kappa)\) with \(\kappa \sim N^{+}(0,0.5)\)

Infectiousness follows this distribution:

\(g \sim Gamma(6.5,0.62)\)

Time to death follows this distribution:

\(\pi \sim Gamma(5.1,0.86)+Gamma(17.8,0.45)\)

The distribution of time to death is plotted below.

The above implies an average time to death of 23 days.
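The 23-day average can be reproduced under the assumption that each Gamma distribution above is parameterised by its mean and coefficient of variation (an assumption that is consistent with the stated average, since 5.1 + 17.8 = 22.9). A Python sketch converting to shape and scale and checking the mean by simulation:

```python
import random

def gamma_mean_cv(mean, cv, rng):
    """Draw from a Gamma distribution given its mean and coefficient of variation."""
    shape = 1.0 / (cv * cv)
    scale = mean * cv * cv          # shape * scale == mean
    return rng.gammavariate(shape, scale)

def sample_time_to_death(rng):
    """Sum of the two Gamma components of the time-to-death distribution."""
    return gamma_mean_cv(5.1, 0.86, rng) + gamma_mean_cv(17.8, 0.45, rng)

rng = random.Random(2020)
draws = [sample_time_to_death(rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
analytic_mean = 5.1 + 17.8
```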

12.3 Death and Confirmed Case Data

Death and case data were taken from [4]. This data set contains, amongst others, provincial case and death data digitised mainly from daily tweets by the National Institute for Communicable Diseases (NICD).

We note again that the model calibrates only to deaths. The reason for not calibrating to confirmed cases is that the bias in testing is unknown. The degree to which testing has focussed on symptomatic people and people seeking medical or hospital treatment is unknown and could have changed over time. Confirmed cases would therefore give a biased estimate of infections and would require adjustment in this model.

Because death data are limited, provinces were aggregated as follows:

  • EC – Eastern Cape
  • GP – Gauteng
  • KZN – KwaZulu-Natal
  • WC – Western Cape
  • OTH (all other provinces)

12.4 Infection Fatality Rates (IFR)

The IFR (\(ifr_{m}\)) for each province was calculated using the output of the squire R package [18]. It produces age-specific infection attack rates (IAR), infections and deaths. The per-age-band IFRs were used together with the per-age-band IARs, and these were applied to provincial populations [19]. The IFRs from the squire package are based on [20], [21] and [18].

The projection was done using the default parameters for South Africa, and the resultant attack rate (\(a_{x}\)) and IFR (\(ifr_{x}\)) for each 5-year age band (\(x\)) were obtained.

Additionally, HIV prevalence by age band and province, \(i_{x,m}^{HIV}\), was taken into account from [22], as well as results from [12]. In [12] it is shown that lives with HIV have higher COVID-19 mortality once infected. Based on these results we assume three times the COVID-19 mortality below age 55 and double the mortality from age 55 upwards (\(m_x^{HIV}\)).

These \(a_{x}\) and \(ifr_{x}\) and HIV adjustments were then applied to the South African population per province and age band as per [19] and weighted IFRs calculated as follows.

\(ifr_{m}=\frac{\sum_{x}N_{x,m} \cdot a_{x} \cdot ifr_{x}( (1-i_{x,m}^{HIV}) +i_{x,m}^{HIV} \cdot m_x^{HIV})}{\sum_{x}N_{x,m}}\)

Here \(N_{x,m}\) is the population in a particular province (\(m\)) and age band (\(x\)).
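The weighting formula above can be sketched in Python as follows. The function name and the two-age-band inputs in the test are hypothetical and are not the squire outputs or Statistics South Africa populations; the sketch implements the formula exactly as written, including the division by total population.

```python
def provincial_ifr(pop, attack, ifr, hiv_prev, hiv_mult):
    """Weighted IFR for one province, following the formula above.

    pop:      population N_{x,m} per age band
    attack:   attack rate a_x per age band
    ifr:      IFR ifr_x per age band
    hiv_prev: HIV prevalence i^{HIV}_{x,m} per age band
    hiv_mult: HIV mortality multiplier m^{HIV}_x per age band
    """
    numerator = sum(
        n * a * f * ((1.0 - h) + h * m)
        for n, a, f, h, m in zip(pop, attack, ifr, hiv_prev, hiv_mult)
    )
    return numerator / sum(pop)
```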

Below we tabulate the resultant \(ifr_{m}\):

| Province | IFR |
|---|---|
| EC | 0.64% |
| GP | 0.44% |
| KZN | 0.47% |
| WC | 0.5% |
| OTH | 0.5% |
| South Africa | 0.49% |

The differences between provinces reflect the different age profiles in those provinces as per [19]. These IFRs seem low compared to those in [15], but may be reasonable given the younger age profile of the South African population.

12.5 Completeness of Reporting of Deaths

Based on excess deaths estimated in [23], [24], [9], [25] and [16] and reported deaths in [4], one can estimate the completeness of reporting. Below, excess deaths as derived in [16] over 6 May 2020 to 21 July 2020 are compared with reported deaths for the same period (per [4]). The assumption is that 90% of excess deaths are COVID-19 related.

It is also assumed that reported deaths lag by 1 week, so an additional week of reported deaths is included in the numerator when measuring completeness. Additionally, the excess deaths are, for some provinces, measured over periods starting after 6 May 2020. Despite this, all reported deaths from 6 May 2020 onwards are included below.

| Province | Excess Deaths | Reported Deaths | COVID-19 Excess Deaths | Completeness |
|---|---|---|---|---|
| EC | 7597.414 | 1832 | 6837.673 | 0.27 |
| GP | 8269.163 | 2268 | 7442.247 | 0.30 |
| KZN | 3901.195 | 976 | 3511.075 | 0.28 |
| WC | 4461.716 | 3245 | 4015.545 | 0.81 |
| OTH | 4315.003 | 563 | 3883.502 | 0.14 |
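The completeness figures in the table follow directly from the other columns: COVID-19 excess deaths are 90% of excess deaths, and completeness is reported deaths divided by COVID-19 excess deaths. A short Python sketch reproducing the calculation from the tabulated values (the variable and function names are illustrative):

```python
# Values copied from the table above.
excess = {"EC": 7597.414, "GP": 8269.163, "KZN": 3901.195,
          "WC": 4461.716, "OTH": 4315.003}
reported = {"EC": 1832, "GP": 2268, "KZN": 976, "WC": 3245, "OTH": 563}

COVID_SHARE = 0.9  # assumed share of excess deaths that are COVID-19 related

def completeness(province):
    """Reported deaths as a share of COVID-19 related excess deaths."""
    return reported[province] / (COVID_SHARE * excess[province])

table = {p: round(completeness(p), 2) for p in excess}
```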

12.6 Mobility Indexes

Google has released aggregated mobility data for various countries and for sub-regions in those countries. These data are generated from devices that have enabled Location History in Google Maps. This feature is available on both Android and iOS devices but is off by default.

For South Africa, these data contain the mobility indexes for each province. These are described in [5] as follows:

  • Grocery & pharmacy: Mobility trends for places like grocery markets, food warehouses, farmers markets, specialty food shops, drug stores, and pharmacies.
  • Parks: Mobility trends for places like local parks, national parks, public beaches, marinas, dog parks, plazas, and public gardens.
  • Transit stations: Mobility trends for places like public transport hubs such as subway, bus, and train stations.
  • Retail & recreation: Mobility trends for places like restaurants, cafes, shopping centres, theme parks, museums, libraries, and movie theatres.
  • Residential: Mobility trends for places of residence.
  • Workplaces: Mobility trends for places of work.

These measure changes in mobility in the above dimensions relative to a baseline established before the epidemic. For example, -30% implies a 30% reduction in mobility from pre-COVID-19 mobility.

As per [7], these data were combined into an average mobility index for each province, calculated as the average of all mobility indexes excluding:

  • Residential
  • Parks

A residential index is also included.

In [1] three indexes were used (Residential, Transit and the rest). This was reduced for this paper due to limited data.

We calculated indexes for OTH by weighting the individual provinces by population.
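The two averaging steps in this section can be sketched as follows. This is an illustrative Python sketch, not the paper's R code: the category keys, province labels, populations and index values are hypothetical, and the mobility values are assumed to be expressed as fractions.

```python
EXCLUDED = {"residential", "parks"}

def average_mobility(indexes):
    """Average of all mobility categories except the excluded ones."""
    kept = [value for name, value in indexes.items() if name not in EXCLUDED]
    return sum(kept) / len(kept)

def population_weighted(values, populations):
    """Population-weighted average of provincial indexes (used for OTH)."""
    total = sum(populations[p] for p in values)
    return sum(values[p] * populations[p] for p in values) / total
```

For example, a province with indexes of -0.2, -0.4, -0.3 and -0.3 for the four retained categories has an average index of -0.3, whatever its parks and residential values are.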

13 Author

This report was prepared by Louis Rossouw. Please get in contact with Louis Rossouw if you have comments or wish to receive this regularly.

Louis Rossouw
Head of Research & Analytics
Gen Re | Life/Health Canada, South Africa, Australia, NZ, UK & Ireland
Email: LRossouw@GenRe.com Mobile: +27 71 355 2550

The views in this document represent those of the author and may not represent those of Gen Re. Also note that, given the significant uncertainty involved in the parameters, data and methodology, care should be taken with these numbers and any use of them.

References

[1] M. Vollmer et al., “Report 20: A sub-national analysis of the rate of transmission of COVID-19 in Italy,” Imperial College London, 2020 [Online]. Available: https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-20-italy/

[2] T. Mellan et al., “Report 21: Estimating COVID-19 cases and reproduction number in Brazil,” Imperial College London, 2020 [Online]. Available: https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-21-brazil/

[3] H. Unwin et al., “Report 23: State-level tracking of COVID-19 in the United States,” Imperial College London, 2020 [Online]. Available: https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-23-united-states/

[4] V. Marivate et al., “Coronavirus disease (COVID-19) case data - South Africa.” Zenodo, 2020 [Online]. Available: https://zenodo.org/record/3888499

[5] Google LLC, “Google COVID-19 community mobility reports.” 2020 [Online]. Available: https://www.google.com/covid19/mobility/

[6] R Core Team, R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2019 [Online]. Available: https://www.R-project.org/

[7] P. Nouvellet et al., “Report 26: Reduction in mobility and COVID-19 transmission,” Imperial College London, 2020 [Online]. Available: https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-26-mobility-transmission/

[8] D. K. Chu et al., “Physical distancing, face masks, and eye protection to prevent person-to-person transmission of SARS-CoV-2 and COVID-19: A systematic review and meta-analysis,” The Lancet, vol. 395, no. 10242, pp. 1973–1987, Jun. 2020, doi: 10.1016/S0140-6736(20)31142-9. [Online]. Available: https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31142-9/fulltext

[9] D. Bradshaw, R. Laubscher, R. Dorrington, P. Groenewald, and T. Moultrie, “Report on weekly deaths in South Africa 1 January - 7 July 2020 (Week 27),” Burden of Disease Research Unit, South African Medical Research Council [Online]. Available: https://www.samrc.ac.za/sites/default/files/files/2020-07-15/WeeklyDeaths7July2020_0.pdf

[10] J. D. Goodman and M. Rothfield, “1 in 5 New Yorkers may have had Covid-19, antibody tests suggest,” New York Times [Online]. Available: https://www.nytimes.com/2020/04/23/nyregion/coronavirus-antibodies-test-ny.html

[11] A. Winde, “Update on the coronavirus by Premier Alan Winde.” Western Cape Government [Online]. Available: https://coronavirus.westerncape.gov.za/news/update-coronavirus-premier-alan-winde-6-june

[12] M.-A. Davies, “Brief update on surveilance of COVID-19 infections in the Western Cape.” 03-Jun-2020. [Online]. Available: http://www.medicine.uct.ac.za/covid19-echo-clinic

[13] G. E. P. Box, “Robustness in the strategy of scientific model building,” in Robustness in statistics, Elsevier, 1979, pp. 201–236 [Online]. Available: https://doi.org/10.1016/b978-0-12-438150-6.50018-2

[14] Independent Communications Authority of South Africa, “The state of the ICT sector report in South Africa,” 2020 [Online]. Available: https://www.icasa.org.za/legislation-and-regulations/state-of-the-ict-sector-in-south-africa-2020-report

[15] G. Meyerowitz-Katz and L. Merone, “A systematic review and meta-analysis of published research data on COVID-19 infection-fatality rates” [Online]. Available: https://www.medrxiv.org/content/10.1101/2020.05.03.20089854v3

[16] D. Bradshaw, R. Laubscher, R. Dorrington, P. Groenewald, and T. Moultrie, “Report on weekly deaths in South Africa 1 January - 21 July 2020 (Week 29),” Burden of Disease Research Unit, South African Medical Research Council [Online]. Available: https://www.samrc.ac.za/sites/default/files/files/2020-07-29/WeeklyDeaths21July2020.pdf

[17] S. Flaxman et al., “Report 13: Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries,” Imperial College London, 2020 [Online]. Available: https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-13-europe-npi-impact/

[18] P. Walker et al., “Report 12: The global impact of COVID-19 and strategies for mitigation and suppression,” Imperial College London, 2020 [Online]. Available: https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-12-global-impact-covid-19/

[19] Statistics South Africa, “Mid-year population estimates 2019,” Republic of South Africa, 2019 [Online]. Available: https://www.statssa.gov.za/publications/P0302/P03022019.pdf

[20] R. Verity et al., “Estimates of the severity of coronavirus disease 2019: A model-based analysis,” The Lancet Infectious Diseases, vol. 20, no. 6, pp. 669–677, Jun. 2020, doi: 10.1016/s1473-3099(20)30243-7. [Online]. Available: https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30243-7/fulltext

[21] N. Ferguson et al., “Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand,” Imperial College London, 2020 [Online]. Available: https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-9-impact-of-npis-on-covid-19/

[22] L. Johnson, R. Dorrington, and H. Moolla, “Progress towards the 2020 targets for HIV diagnosis and antiretroviral treatment in South Africa,” Southern African Journal of HIV Medicine, vol. 18, no. 1, p. 8, 2017, doi: 10.4102/sajhivmed.v18i1.694. [Online]. Available: https://sajhivmed.org.za/index.php/hivmed/article/view/694

[23] D. Bradshaw, R. Laubscher, R. Dorrington, P. Groenewald, and T. Moultrie, “Report on weekly deaths in South Africa 1 January - 23 June 2020 (Week 25),” Burden of Disease Research Unit, South African Medical Research Council [Online]. Available: https://www.samrc.ac.za/sites/default/files/files/2020-07-01/WeeklyDeaths23June2020_0.pdf

[24] D. Bradshaw, R. Laubscher, R. Dorrington, P. Groenewald, and T. Moultrie, “Report on weekly deaths in South Africa 1 January - 30 June 2020 (Week 26),” Burden of Disease Research Unit, South African Medical Research Council [Online]. Available: https://www.samrc.ac.za/sites/default/files/files/2020-07-08/WeeklyDeaths30June2020.pdf

[25] D. Bradshaw, R. Laubscher, R. Dorrington, P. Groenewald, and T. Moultrie, “Report on weekly deaths in South Africa 1 January - 14 July 2020 (Week 28),” Burden of Disease Research Unit, South African Medical Research Council [Online]. Available: https://www.samrc.ac.za/sites/default/files/files/2020-07-01/WeeklyDeaths23June2020_0.pdf